Classifier Chain Networks for Multi-Label Classification

Touw, Daniel J. W., van de Velden, Michel

arXiv.org Machine Learning

In contrast to binary and multi-class classification, where each observation in the data is assigned to a single class, an observation in a multi-label classification task can have multiple labels. This type of problem arises in different fields, such as object detection in images, text analysis, bioinformatics, and recommendation systems (Tsoumakas et al., 2010). Consequently, numerous methods have been developed to handle multi-labeled outcomes. In contrast to existing methods, which often focus on modeling each outcome variable separately, our proposed method jointly models all labels to capture dependencies between them. In this study, we also refer to these dependencies between labels as label interdependencies. A frequently used method for a classification task with multi-labeled outcomes is to decompose the task into separate independent binary classifications (e.g., Boutell et al., 2004; Luaces et al., 2012). This approach is typically referred to as binary relevance. A limitation of binary relevance is that it does not exploit potential correlations between the different labels (Godbole and Sarawagi, 2004; Zhang and Zhou, 2014).
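The binary relevance decomposition described above can be sketched in a few lines of stdlib-only Python. The nearest-centroid base learner here is a hypothetical stand-in for any binary classifier; the toy data are made up for illustration.

```python
from statistics import mean

def fit_centroid(X, y):
    """Trivial base learner (illustration only): classify by the nearer
    of the positive and negative class centroids."""
    centroid = lambda pts: [mean(col) for col in zip(*pts)]
    c1 = centroid([x for x, t in zip(X, y) if t == 1])
    c0 = centroid([x for x, t in zip(X, y) if t == 0])
    def predict(x):
        dist = lambda c: sum((a - b) ** 2 for a, b in zip(x, c))
        return 1 if dist(c1) < dist(c0) else 0
    return predict

def binary_relevance_fit(X, Y):
    """Binary relevance: one independent binary classifier per label column."""
    return [fit_centroid(X, [row[j] for row in Y]) for j in range(len(Y[0]))]

def binary_relevance_predict(models, x):
    return [m(x) for m in models]

# Toy data: label 0 fires on a large first feature, label 1 on a large second.
X = [[0.0, 0.0], [1.0, 0.1], [0.1, 1.0], [1.0, 1.0]]
Y = [[0, 0], [1, 0], [0, 1], [1, 1]]
models = binary_relevance_fit(X, Y)
print(binary_relevance_predict(models, [0.9, 0.0]))  # → [1, 0]
```

Because each label's model is fit in isolation, any correlation between the labels is never used, which is exactly the limitation of binary relevance the abstract notes.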


On the Optimality of Classifier Chain for Multi-label Classification

Liu, Weiwei, Tsang, Ivor

Neural Information Processing Systems

To capture the interdependencies between labels in multi-label classification problems, the classifier chain (CC) takes the multiple labels of each instance into account under a deterministic high-order Markov chain model. Since its performance is sensitive to the choice of label order, the key issue is how to determine the optimal label order for CC. In this work, we first generalize the CC model over a random label order. Then, we present a theoretical analysis of the generalization error for the proposed generalized model. Based on our results, we propose a dynamic programming based classifier chain (CC-DP) algorithm to search for the globally optimal label order for CC and a greedy classifier chain (CC-Greedy) algorithm to find a locally optimal CC. Comprehensive experiments on a number of real-world multi-label data sets from various domains demonstrate that the proposed CC-DP algorithm outperforms state-of-the-art approaches and that CC-Greedy achieves prediction performance comparable to CC-DP.
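A minimal sketch of the chain mechanism the paper analyzes: the classifier at each chain position sees the original features plus the labels earlier in the chain (true labels at training time, predicted labels at test time). The 1-nearest-neighbour base learner and the toy data are illustrative assumptions, not part of the paper.

```python
def fit_1nn(X, y):
    """1-nearest-neighbour base learner (illustration only)."""
    data = list(zip(X, y))
    def predict(x):
        return min(data, key=lambda p: sum((a - b) ** 2 for a, b in zip(p[0], x)))[1]
    return predict

def chain_fit(X, Y, order):
    """Train one classifier per chain position; the classifier for label j
    sees the original features plus the true labels earlier in the chain."""
    models = {}
    for i, j in enumerate(order):
        Xa = [x + [row[k] for k in order[:i]] for x, row in zip(X, Y)]
        models[j] = fit_1nn(Xa, [row[j] for row in Y])
    return models

def chain_predict(models, order, x):
    """At test time, earlier *predictions* stand in for the true labels."""
    preds = {}
    for i, j in enumerate(order):
        preds[j] = models[j](x + [preds[k] for k in order[:i]])
    return [preds[j] for j in sorted(preds)]

# Toy data: one feature; label 1 is the negation of label 0, so the chain
# feature carries the signal for the second classifier.
X = [[0.0], [0.2], [0.8], [1.0]]
Y = [[0, 1], [0, 1], [1, 0], [1, 0]]
order = [0, 1]
models = chain_fit(X, Y, order)
print(chain_predict(models, order, [0.85]))  # → [1, 0]
```

Changing `order` changes what each later classifier conditions on, which is precisely why label order matters and why CC-DP/CC-Greedy search over it.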


Cluster-Guided Label Generation in Extreme Multi-Label Classification

Jung, Taehee, Kim, Joo-Kyung, Lee, Sungjin, Kang, Dongyeop

arXiv.org Artificial Intelligence

For extreme multi-label classification (XMC), existing classification-based models perform poorly on tail labels and often ignore the semantic relations among labels, e.g., treating "Wikipedia" and "Wiki" as independent and separate labels. In this paper, we cast XMC as a generation task (XLGen), where we benefit from pre-trained text-to-text models. However, generating labels from the extremely large label space is challenging without any constraints or guidance. We therefore propose to guide label generation using label cluster information to hierarchically generate lower-level labels. We also find that frequency-based label ordering and decoding ensemble methods are critical factors for the improvements in XLGen. XLGen with cluster guidance significantly outperforms the classification and generation baselines on tail labels, and also generally improves the overall performance on four popular XMC benchmarks. In human evaluation, we also find that XLGen generates unseen but plausible labels. Our code is now available at https://github.com/alexa/xlgen-eacl-2023.
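The frequency-based label ordering the paper identifies as a critical factor can be sketched independently of the text-to-text model: each instance's target label set is serialized from most to least frequent in the training corpus. The tie-break by name and the toy labels are assumptions for determinism, not details from the paper.

```python
from collections import Counter

def frequency_order_targets(label_sets):
    """Serialize each instance's label set from most to least frequent in
    the corpus (ties broken alphabetically), giving the generation model a
    consistent target sequence instead of an arbitrary set order."""
    freq = Counter(label for s in label_sets for label in s)
    return [sorted(s, key=lambda l: (-freq[l], l)) for s in label_sets]

train = [{"wiki", "nlp"}, {"wiki"}, {"nlp", "xmc", "wiki"}]
print(frequency_order_targets(train))
# → [['wiki', 'nlp'], ['wiki'], ['wiki', 'nlp', 'xmc']]
```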


Order-free Learning Alleviating Exposure Bias in Multi-label Classification

Tsai, Che-Ping, Lee, Hung-Yi

arXiv.org Machine Learning

Multi-label classification (MLC) assigns multiple labels to each sample. Prior studies show that MLC can be transformed into a sequence prediction problem with a recurrent neural network (RNN) decoder to model the label dependency. However, training an RNN decoder requires a predefined order of labels, which is not directly available in the MLC specification. Besides, an RNN thus trained tends to overfit the label combinations in the training set and has difficulty generating unseen label sequences. In this paper, we propose a new framework for MLC which does not rely on a predefined label order and thus alleviates exposure bias. The experimental results on three multi-label classification benchmark datasets show that our method outperforms competitive baselines by a large margin. We also find that the proposed approach has a higher probability than the baseline models of generating label combinations not seen during training. This result shows that the proposed approach has better generalization capability.


Bayesian Network Based Label Correlation Analysis For Multi-label Classifier Chain

Wang, Ran, Ye, Suhe, Li, Ke, Kwong, Sam

arXiv.org Machine Learning

Classifier chain (CC) is a multi-label learning approach that constructs a sequence of binary classifiers according to a label order. Each classifier in the sequence is responsible for predicting the relevance of one label. When training the classifier for a label, preceding labels are taken as extended features. If the extended features are highly correlated with the label, performance improves; otherwise, performance is unaffected or even degraded. How to discover label correlations and determine the label order is therefore critical for the CC approach. This paper employs a Bayesian network (BN) to model the label correlations and proposes a new BN-based CC method (BNCC). First, conditional entropy is used to describe the dependency relations among labels. Then, a BN is built by taking labels as nodes and their dependency relations as edge weights. A new scoring function is proposed to evaluate a BN structure, and a heuristic algorithm is introduced to optimize the BN. Finally, by applying topological sorting to the nodes of the optimized BN, the label order for constructing the CC model is derived. Experimental comparisons demonstrate the feasibility and effectiveness of the proposed method.
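The first step of BNCC, measuring label dependency with conditional entropy, can be sketched as follows. The BN construction, scoring function, and topological sorting are omitted, so this shows only the dependency measure, not the full method; the toy label matrix is made up.

```python
from collections import Counter
from math import log2

def cond_entropy(Y, i, j):
    """Empirical conditional entropy H(Y_i | Y_j) from a binary label
    matrix Y (one row per instance, one column per label). Lower values
    mean label j carries more information about label i."""
    n = len(Y)
    p_j = Counter(row[j] for row in Y)
    p_ij = Counter((row[i], row[j]) for row in Y)
    h = 0.0
    for (vi, vj), count in p_ij.items():
        h -= (count / n) * log2(count / p_j[vj])
    return h

# Toy labels: Y_1 is fully determined by Y_0, while Y_2 is independent noise.
Y = [[0, 0, 0], [0, 0, 1], [1, 1, 0], [1, 1, 1]]
print(cond_entropy(Y, 1, 0))  # → 0.0 (no uncertainty left in Y_1 given Y_0)
print(cond_entropy(Y, 2, 0))  # → 1.0 (Y_0 tells us nothing about Y_2)
```

In BNCC these pairwise dependencies become edge weights of the Bayesian network whose optimized structure, topologically sorted, yields the chain order.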


Order-Free RNN With Visual Attention for Multi-Label Classification

Chen, Shang-Fu (National Taiwan University) | Chen, Yi-Chen (National Taiwan University) | Yeh, Chih-Kuan (Carnegie Mellon University) | Wang, Yu-Chiang Frank (National Taiwan University)

AAAI Conferences

While a number of research works (Zhang and Zhou 2006; Nam et al. 2014; Gong et al. 2013; Wei et al. 2014; Wang et al. 2016) start to advance the CNN architecture for multi-label classification, CNN-RNN (Wang et al. 2016) embeds image and semantic structures by projecting both features into a joint embedding space. By further utilizing the Long Short-Term Memory (LSTM) component (Hochreiter and Schmidhuber 1997), a recurrent neural network (RNN) structure is introduced to memorize long-term label dependency. As a result, CNN-RNN exhibits promising multi-label classification performance with cross-label correlation implicitly preserved. We propose an RNN-based model for image multi-label classification. Our model uniquely integrates learning of visual attention and LSTM layers, which jointly learn the labels of interest and their co-occurrences while the associated image regions are visually attended. Different from existing approaches, which utilize either model in their network architectures, training of our model does not require predefined label orders. Moreover, a robust inference process is introduced so that prediction errors do not propagate.


Dynamic classifier chains for multi-label learning

Trajdos, Pawel, Kurzynski, Marek

arXiv.org Machine Learning

In this paper, we deal with the task of building a dynamic ensemble of chain classifiers for multi-label classification. To do so, we propose two classifier chain algorithms that can change the label order of the chain without rebuilding the entire model. Such models allow anticipating an instance-specific chain order without a significant increase in computational burden. The proposed chain models are built using the Naive Bayes classifier and the nearest neighbour approach as base single-label classifiers. To take advantage of the proposed algorithms, we developed a simple heuristic that allows the system to find a relatively good label order. The heuristic sorts labels according to the label-specific classification quality obtained during the validation phase, aiming to minimise error propagation along the chain. The experimental results show that the proposed model, based on the Naive Bayes classifier and the above-mentioned heuristic, is an efficient tool for building dynamic chain classifiers.
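The ordering heuristic described above reduces to a single sort: labels with the best single-label validation quality enter the chain first, so their more reliable predictions feed the later classifiers. The accuracy values below are hypothetical, made up for illustration.

```python
def heuristic_order(val_quality):
    """Chain order from per-label validation quality: the most reliably
    predicted labels come first, limiting error propagation down the chain."""
    return sorted(range(len(val_quality)), key=lambda j: -val_quality[j])

# Hypothetical per-label validation accuracies for a 4-label problem.
accuracy = [0.72, 0.91, 0.55, 0.88]
print(heuristic_order(accuracy))  # → [1, 3, 0, 2]
```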